Abstract

Boolean functions can be used to express the groundness of, and trace grounding
dependencies between, program variables in (constraint) logic programs. In this
paper, a variety of issues pertaining to the efficient Prolog implementation of
groundness analysis are investigated, focusing on the domain of definite Boolean
functions, Def. The systematic design of the representation of an abstract
domain is discussed in relation to its impact on the algorithmic complexity of
the domain operations; the most frequently called operations should be the most
lightweight. This methodology is applied to Def, resulting in a new
representation, together with new algorithms for its domain operations utilising
previously unexploited properties of Def - for instance, quadratic-time
entailment checking. The iteration strategy driving the analysis is also
discussed and a simple, but very effective, optimisation of induced magic is
described. The analysis can be implemented straightforwardly in Prolog and the
use of a non-ground representation results in an efficient, scalable tool which
does not require widening to be invoked, even on the largest benchmarks. An
extensive experimental evaluation is given.

Keywords: Abstract interpretation, groundness analysis, definite Boolean functions, fix- 
point algorithms. 



1 Introduction 

Groundness analysis is an important theme of logic programming and abstract in- 
terpretation. Groundness analyses identify those program variables bound to terms 
that contain no variables (ground terms). Groundness information is typically in- 
ferred by tracking dependencies among program variables. These dependencies are 
commonly expressed as Boolean functions. For example, the function
$x \wedge (y \leftarrow z)$ describes a state in which x is definitely ground,
and there exists a grounding dependency such that whenever z becomes ground
then so does y.



Groundness analyses usually track dependencies using either Pos, the class of
positive Boolean functions (Bagnara & Schachte, 1999; Baker & Søndergaard, 1993;
Codish & Demoen, 1995; Fecht & Seidl, 1999; Marriott & Søndergaard, 1993; Van
Hentenryck et al., 1995), or Def, the class of definite positive functions
(Armstrong et al., 1998; Dart, 1991; Garcia de la Banda et al., 1996; Genaim &
Codish, 2001; Howe & King, 2000). Pos is more expressive than Def, but studies
have shown that Def analysers can be faster than comparable Pos analysers
(Armstrong et al., 1998) and, in practice, the loss of precision for
goal-dependent groundness analysis is usually small (Heaton et al., 2000). This
paper is a development of (Howe & King, 2000) and is an exploration of the
representation of Boolean functions for
groundness analysis and the use of Prolog as a medium for implementing all the 
components of a groundness analyser. 

The rationale for this work was to develop an analyser with conceptually sim- 
ple domain operations, with a small and simple (thus easily maintained) Prolog 
implementation based on a meta-interpreter and with performance comparable to 
that of BDD-based analysers. Moreover, since Prolog is well suited to symbolic ma- 
nipulation, it should provide an appropriate medium for implementing a symbolic 
analysis, such as groundness analysis. Any analysis that can be quickly prototyped 
in Prolog is particularly attractive. The main drawback of this approach has tra- 
ditionally been performance. The efficiency of an analyser can be guaranteed by 
including a widening (the controlled exchange of precision for scalability). How- 
ever, a successful analyser should fire widening infrequently to maximise precision. 

The efficiency of groundness analysis depends critically on the way
dependencies are represented. Representation has two aspects: the theoretical
representation (BDDs, Blake Canonical Form, etc.) of the Boolean functions and
the data-structures of the implementation language that are used to support
this representation. The theoretical representation determines the complexity of the domain
operations, but the implementation requires the specific data-structures used to be 
amenable to efficient implementation in the chosen language. That is, the imple- 
mentation can push the complexity into a higher class, or introduce a prohibitive 
constant factor in the complexity function. This paper considers how a represen- 
tation should be chosen for the intended application (groundness analysis) by bal- 
ancing the size of the representation (and its impact) with the complexity of the 
abstract operations and the frequency with which these operations are applied. The 
paper also explains how Prolog can be used to implement a particularly efficient 
Def-based groundness analysis. The orthogonal issue of the iteration strategy used
to drive the analysis is also considered. Specifically, this paper makes the following 
contributions: 



• A representation of Def formulae as non-canonical conjunctions of clauses is
chosen by following a methodology that advocates: 1) ensuring that the most 
commonly called domain operations are the most lightweight; 2) that the 
abstractions that arise in practice should be dense; 3) that, where possible, 
expensive domain operations should be filtered by lightweight special cases. 

• A fast Prolog implementation of Def-based groundness analysis is given
founded on the methodology above, using a compact, factorised
representation.

• Representing Boolean functions as non-ground formulae allows succinct im- 
plementation of domain operations. In particular a constant-time meet is 
achieved using difference lists and a quadratic-time entailment check is built
using delay declarations. 

• A new join algorithm is presented which does not require formulae to be 
preprocessed into a canonical form. 

• The use of entailment checking as a filter for join is described, as is the use 
of a filtered projection. 

• Various iteration strategies are systematically compared and it is suggested
(at least for groundness analysis) that good performance can be obtained by 
a surprisingly simple analysis framework. 

• An extensive experimental evaluation of groundness analysis using a variety 
of combinations of domains, representations and iteration strategies is given. 

• As a whole, the work presented in this paper strongly suggests that the 
implementor can produce a robust, fast, precise, scalable analyser for goal- 
dependent groundness analysis written in Prolog. The analyser presented does 
not require widening to be applied for any program in the benchmark suite.

The rest of the paper is structured as follows: Section 2 details the necessary 
preliminaries. Section 3 reviews the methods used for choosing the representation of 
Def. It goes on to describe various representations of Def in relation to a frequency 
analysis of the operations; a non-canonical representation as conjunctions of clauses 
is detailed. Section 4 describes a new join algorithm, along with filtering techniques 
for join and for projection. Section 5 discusses a variety of iteration strategies 
for driving an analysis. Section 6 gives an experimental evaluation of the various 
combinations of domain representations and iteration strategy for Def (and also 
for the domains EPos and Pos). Section 7 surveys related work and Section 8 
concludes. 

2 Preliminaries 

A Boolean function is a function $f : Bool^n \to Bool$ where $n > 0$. Let $V$
denote a denumerable universe of variables. A Boolean function can be
represented by a propositional formula over $X$ where $|X| = n$. The set of
propositional formulae over $X$ is denoted by $Bool_X$. Throughout this paper,
Boolean functions and propositional formulae are used interchangeably without
worrying about the distinction. The convention of identifying a truth
assignment with the set of variables $M$ that it maps to true is also followed.
Specifically, a map $\psi_X : \wp(X) \to Bool_X$ is introduced defined by:
$\psi_X(M) = (\wedge M) \wedge \neg(\vee (X \setminus M))$. In addition, the
formula $\wedge Y$ is often abbreviated as $Y$.

Definition 1 

The (bijective) map $model_X : Bool_X \to \wp(\wp(X))$ is defined by:
$model_X(f) = \{M \subseteq X \mid \psi_X(M) \models f\}$.

Example 1 

If $X = \{x, y\}$, then the function $\{\langle true, true\rangle \mapsto true,
\langle true, false\rangle \mapsto false, \langle false, true\rangle \mapsto
false, \langle false, false\rangle \mapsto false\}$ can be represented by the
formula $x \wedge y$. Also, $model_X(x \wedge y) = \{\{x, y\}\}$ and
$model_X(x \vee y) = \{\{x\}, \{y\}, \{x, y\}\}$.
The focus of this paper is on the use of sub-classes of $Bool_X$ in tracing groundness
dependencies. These sub-classes are defined below: 

Definition 2 

A function $f$ is positive iff $X \in model_X(f)$. $Pos_X$ is the set of
positive Boolean functions over $X$. A function $f$ is definite iff
$M \cap M' \in model_X(f)$ for all $M, M' \in model_X(f)$. $Def_X$ is the set of
positive functions over $X$ that are definite. A function $f$ is GE iff $f$ is
definite positive and, where $Y = \cap\, model_X(f)$, for all
$M, M' \in model_X(f)$, $Y \cup (M \setminus M') \in model_X(f)$. $EPos_X$ is
the set of GE functions over $X$.

Note that $EPos_X \subseteq Def_X \subseteq Pos_X$. One useful representational
property of $Def_X$ is that each $f \in Def_X$ can be described as a conjunction
of definite (propositional) clauses, that is,
$f = \wedge_{i=1}^{m} (y_i \leftarrow \wedge Y_i)$ (Dart, 1991). Note that the
$y_i$ are not necessarily distinct. Finally, $Def$ abbreviates $Def_V$. Also
notice that $EPos_X = \{\wedge F \mid F \subseteq X \cup E_X\}$, where
$E_X = \{x \leftrightarrow y \mid x, y \in X\}$.

Example 2 

Suppose $X = \{x, y, z\}$ and consider the following table, which states, for
some Boolean functions, whether they are in $EPos_X$, $Def_X$ or $Pos_X$ and
also gives $model_X$.

$f$                         $EPos_X$  $Def_X$  $Pos_X$  $model_X(f)$
$false$                     no        no       no       $\emptyset$
$x \wedge y$                yes       yes      yes      $\{\{x,y\}, \{x,y,z\}\}$
$x \vee y$                  no        no       yes      $\{\{x\}, \{y\}, \{x,y\}, \{x,z\}, \{y,z\}, \{x,y,z\}\}$
$x \leftarrow y$            no        yes      yes      $\{\emptyset, \{x\}, \{z\}, \{x,y\}, \{x,z\}, \{x,y,z\}\}$
$x \vee (y \leftarrow z)$   no        no       yes      $\{\emptyset, \{x\}, \{y\}, \{x,y\}, \{x,z\}, \{y,z\}, \{x,y,z\}\}$
$true$                      yes       yes      yes      $\{\emptyset, \{x\}, \{y\}, \{z\}, \{x,y\}, \{x,z\}, \{y,z\}, \{x,y,z\}\}$

Note, in particular, that $x \vee y$ is not in $Def_X$ (since its set of models
is not closed under intersection) and that $false$ is neither in $EPos_X$, nor
$Pos_X$ nor $Def_X$.
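The class memberships in Example 2 can be checked mechanically on small variable sets. The following Python sketch is for illustration only and is not part of the analyser described in this paper; the helper names `models`, `is_positive` and `is_definite` are hypothetical. It enumerates $model_X(f)$ by brute force and tests positivity and closure under intersection.

```python
from itertools import combinations

def models(f, X):
    """Return model_X(f): every subset M of X whose truth assignment satisfies f.

    f is a Python predicate over a dict mapping each variable to a bool.
    """
    X = sorted(X)
    return {frozenset(M)
            for k in range(len(X) + 1)
            for M in combinations(X, k)
            if f({v: (v in M) for v in X})}

def is_positive(f, X):
    # f is positive iff X itself is a model of f.
    return frozenset(X) in models(f, X)

def is_definite(f, X):
    # f is definite iff its set of models is closed under intersection.
    ms = models(f, X)
    return all((m1 & m2) in ms for m1 in ms for m2 in ms)

X = {"x", "y", "z"}
x_or_y = lambda a: a["x"] or a["y"]          # x \/ y
x_if_y = lambda a: a["x"] or not a["y"]      # x <- y

print(is_positive(x_or_y, X), is_definite(x_or_y, X))
print(is_positive(x_if_y, X), is_definite(x_if_y, X))
```

Enumeration is exponential in $|X|$, so this is only a way of checking small examples by hand, not a representation suitable for analysis.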

Defining $f_1 \dot\vee f_2 = \wedge\{f \in Def_X \mid f_1 \models f \wedge f_2
\models f\}$, the 4-tuple $\langle Def_X, \models, \wedge, \dot\vee\rangle$ is a
finite lattice (Armstrong et al., 1998), where $true$ is the top element and
$\wedge X$ is the bottom element. Existential quantification is defined by
Schröder's Elimination Principle, that is,
$\exists x.f = f[x \mapsto true] \vee f[x \mapsto false]$. Note that if
$f \in Def_X$ then $\exists x.f \in Def_X$ (Armstrong et al., 1998).



Example 3 

If $X = \{x, y\}$ then $x \dot\vee (x \leftrightarrow y) = \wedge\{(x \leftarrow
y), true\} = (x \leftarrow y)$, as can be seen in the Hasse diagram for dyadic
$Def_X$ (Fig. 1). Note also that $x \dot\vee y = \wedge\{true\} = true \neq
(x \vee y)$.
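Example 3 can be replayed on model sets. Since entailment corresponds to inclusion of model sets and the model sets of $Def_X$ functions are exactly those that contain $X$ and are closed under intersection, the models of the join form the least intersection-closed set containing the models of both arguments. The Python sketch below illustrates this by brute force; the function names are hypothetical and the code is not the algorithm used by the analyser.

```python
from itertools import combinations

def models(f, X):
    """Enumerate model_X(f) for a predicate f over an assignment dict."""
    X = sorted(X)
    return {frozenset(M)
            for k in range(len(X) + 1)
            for M in combinations(X, k)
            if f({v: (v in M) for v in X})}

def def_join_models(ms1, ms2):
    """Models of the Def join: close the union of both model sets under intersection."""
    closed = set(ms1) | set(ms2)
    changed = True
    while changed:
        changed = False
        for m1 in list(closed):
            for m2 in list(closed):
                if (m1 & m2) not in closed:
                    closed.add(m1 & m2)
                    changed = True
    return closed

X = {"x", "y"}
mx = models(lambda a: a["x"], X)                       # x
my = models(lambda a: a["y"], X)                       # y
meq = models(lambda a: a["x"] == a["y"], X)            # x <-> y
x_if_y = models(lambda a: a["x"] or not a["y"], X)     # x <- y
top = models(lambda a: True, X)                        # true

print(def_join_models(mx, meq) == x_if_y)
print(def_join_models(mx, my) == top)
```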

The set of (free) variables in a syntactic object $o$ is denoted by $var(o)$.
Also, $\exists\{y_1, \ldots, y_n\}.f$ (project out) abbreviates
$\exists y_1. \ldots \exists y_n.f$ and $\overline{\exists} Y.f$ (project onto)
denotes $\exists (var(f) \setminus Y).f$. Let $\rho_1, \rho_2$ be fixed
renamings such that $X \cap \rho_1(X) = X \cap \rho_2(X) = \rho_1(X) \cap
\rho_2(X) = \emptyset$. Renamings are bijective and therefore invertible.

Downward closure, $\downarrow$, relates Pos and Def and is useful when tracking
sharing with Boolean functions (Codish et al., 1999). It is defined by
$\downarrow f = model_X^{-1}(\{\cap S \mid \emptyset \subset S \subseteq
model_X(f)\})$. Note that $\downarrow f$ has the useful computational property
that $\downarrow f = \wedge\{f' \in Def_X \mid f \models f'\}$ if $f \in Pos_X$.
That is, $\downarrow$ takes a Pos formula to its best Def approximation.
Finally, for any $f \in Bool_X$, $coneg(f) = model_X^{-1}(\{X \setminus M \mid
M \in model_X(f)\})$ (Codish et al., 1999).

Fig. 1. Hasse Diagrams (dyadic $Def_X$ and $Pos_{\{x,y\}}$)

The following pieces of logic programming terminology will also be needed. Let
$T$ denote the set of terms constructed from $V$ and a set of function symbols
$F$. $\Pi$ is a set of predicate symbols. An equation $e$ is a pair $(s = t)$
where $s, t \in T$. A substitution is a (total) map $\theta : V \to T$ such that
$\{v \in V \mid \theta(v) \neq v\}$ is finite. Let $Sub$ denote the set of
substitutions and let $E$ denote a finite set of equations. Let $\theta(t)$
denote the term obtained by simultaneously replacing each occurrence of $v$ in
$t$ with $\theta(v)$, and let $\theta(E) = \{\theta(s) = \theta(t) \mid (s = t)
\in E\}$.

Composition of substitutions induces the (more general than) relation $\leq$
defined by: $\theta \leq \psi$ if there exists $\delta \in Sub$ such that
$\psi = \delta \circ \theta$. More general than lifts to terms by $s \leq t$ iff
there exists $\delta \in Sub$ such that $\delta(s) = t$. The set of unifiers of
$E$, $unify(E)$, is defined by: $unify(E) = \{\theta \in Sub \mid \forall (s = t)
\in E.\ \theta(s) = \theta(t)\}$ and the set of most general unifiers, $mgu(E)$,
is defined by: $mgu(E) = \{\theta \in unify(E) \mid \forall \psi \in unify(E).\
\theta \leq \psi\}$. Finally, the set of generalisations of two terms is defined
by: $gen(t_1, t_2) = \{t \in T \mid t \leq t_1 \wedge t \leq t_2\}$ and the set
of most specific generalisations is defined by: $msg(t_1, t_2) = \{t \in
gen(t_1, t_2) \mid \forall s \in gen(t_1, t_2).\ s \leq t\}$.
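A most specific generalisation can be computed by anti-unification. The Python sketch below is a minimal illustration under stated assumptions: variables are encoded as strings, compound terms and constants as tuples `(functor, arg1, ..., argn)`, and the fresh variable names `_G0`, `_G1`, ... are assumed not to occur in the input terms. The function name `msg` mirrors the definition above, but the encoding is hypothetical.

```python
def msg(s, t):
    """Return one most specific generalisation of the terms s and t."""
    table = {}  # maps each pair of disagreeing subterms to its generalising variable

    def walk(a, b):
        if a == b:
            return a
        same_functor = (isinstance(a, tuple) and isinstance(b, tuple)
                        and a[0] == b[0] and len(a) == len(b))
        if same_functor:
            # Same functor and arity: generalise argument by argument.
            return (a[0],) + tuple(walk(x, y) for x, y in zip(a[1:], b[1:]))
        # Disagreement: reuse the variable already chosen for this pair, if any.
        if (a, b) not in table:
            table[(a, b)] = "_G%d" % len(table)
        return table[(a, b)]

    return walk(s, t)

s = ("p", ("f", ("a",)), ("a",))      # p(f(a), a)
t = ("p", ("f", ("b",)), ("b",))      # p(f(b), b)
print(msg(s, t))
```

Reusing one variable per disagreeing pair is what makes the result most specific: both occurrences of the pair (a, b) are generalised by the same variable, giving p(f(G), G) rather than p(f(G), H).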



3 Choosing a Representation for Def 

3.1 Review of Design Methods 

The efficiency of an analyser depends critically on the algorithmic complexities of 
its abstract domain operations. These in turn are determined by the representation 
of the abstract domain. The representation also determines the size of the inputs 
to the domain operations, as well as impacting on memory usage. Because of this, 
the choice of representation is fundamental to the efficiency of an analyser and 
is therefore of great importance. The remainder of this subsection reviews three 
factors which should help the implementor arrive at a suitable representation and 
suggest where domain operations might be refined. 



3.1.1 Frequency Analysis of the Domain Operations 



There are typically many degrees of freedom in designing an analyser, even for a 
given domain. Furthermore, work can often be shifted from one abstract operation 
into another. For example, Boolean formulae can be represented in either
conjunctive normal form (CNF) or disjunctive normal form (DNF). In CNF, conjunction
is constant-time and disjunction is quadratic-time, whereas in DNF, conjunction 
is quadratic-time and disjunction is constant-time. Ideally, an analysis should be 
designed so that the most frequently used operations have low complexity and are 
therefore fast. This motivates the following approach: 

1. Prototype an analyser for the given domain. 

2. Instrument the analyser to count the number of times each domain operation 
is invoked. 

3. Generate these counts for a number of programs (the bigger the better). 

4. Choose a representation which gives a good match between the frequency and 
the complexity of the domain operations. 

Because the frequency analysis is solely concerned with generated instruction counts, 
the efficiency of the prototype analyser is not a significant issue. The objective is 
to choose a representation for which the most frequently occurring operations are 
also the fastest. However, this criterion needs to be balanced with others, such as 
the density of the representation. 
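Step 2 of the approach can be sketched as follows. This is a hypothetical Python illustration, not the instrumentation used for the experiments in this paper: a decorator counts calls to each domain operation and the counts are then converted to relative frequencies. The operations `meet` and `rename`, and the clause encoding as (head, body) pairs, are placeholders.

```python
from collections import Counter
from functools import wraps

calls = Counter()

def counted(op):
    """Wrap a domain operation so that every invocation is tallied by name."""
    @wraps(op)
    def wrapper(*args, **kwargs):
        calls[op.__name__] += 1
        return op(*args, **kwargs)
    return wrapper

@counted
def meet(f1, f2):
    # Placeholder: formulae are sets of clauses, so conjunction is set union.
    return f1 | f2

@counted
def rename(f, suffix):
    # Placeholder: rename every variable by appending a suffix.
    return frozenset((h + suffix, frozenset(b + suffix for b in body))
                     for h, body in f)

def relative_frequencies(counter):
    """Convert raw call counts to percentages of the total."""
    total = sum(counter.values())
    return {name: round(100.0 * n / total, 1) for name, n in counter.items()}

f = frozenset({("x", frozenset({"y"}))})
g = meet(f, rename(f, "1"))
h = meet(g, f)
print(relative_frequencies(calls))
```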

3.1.2 Density of the Domain Representation 

The complexity of the domain operations is a function of the size of their inputs. 
Large inputs nullify the value of good complexities, hence a balance between size 
of representation and complexity of domain operations is needed. The following 

factors impact on this relationship: 

1. The abstractions which typically arise should be represented compactly. 

2. A factorised representation with an expressive, high density, low maintenance 
component is attractive. 

3. Maintaining the representation (for example, as a canonical form) should not 
come with a high overhead. 

4. The representation should fit with machinery available in the implementation 
language. 

A domain is said to be factorised if its information is represented as a product of 

subdomains. It may not always be possible to fulfill all these requirements.
Moreover, these factors need to be balanced with others, such as their impact on the
complexities of frequently called domain operations. 

3.1.3 Filtering the Domain Operations 

In many analyses it is inevitable that some domain operations will have high com- 
plexity. However, it is sometimes possible to reduce the impact of this by filtering 

the operation, as follows: 

1. For a high complexity domain operation identify special cases where the op- 
eration can be calculated using a lower complexity algorithm. 
2. Instrument the analyser to quantify how often the lower complexity algorithm
can be applied. 

3. If it appears that the special case occurs frequently, then implement the special 
case and measure the impact on performance. 

The bottom line is that the cost of detecting the special case should not outweigh 
the benefit of applying the specialised domain algorithm. 
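The filtering steps above can be sketched generically. In the hypothetical Python illustration below, `make_filtered` wraps an expensive operation with a cheap special case and records how often each path is taken (step 2). The stand-in operations work on sets of models and are assumptions made for the example, not the analyser's domain operations.

```python
def make_filtered(expensive_op, cheap_case):
    """Return an operation that tries cheap_case before expensive_op.

    cheap_case returns a result, or None when the special case does not apply.
    """
    stats = {"cheap": 0, "expensive": 0}

    def op(a, b):
        shortcut = cheap_case(a, b)
        if shortcut is not None:
            stats["cheap"] += 1
            return shortcut
        stats["expensive"] += 1
        return expensive_op(a, b)

    op.stats = stats
    return op

def pairwise_join(a, b):
    # Stand-in for a costly operation: all pairwise intersections of models.
    return frozenset(m1 & m2 for m1 in a for m2 in b)

def ordered_case(a, b):
    # Special case: if one argument entails the other, the weaker one is the join.
    if a <= b:
        return b
    if b <= a:
        return a
    return None

fs = frozenset
x = fs({fs({"x"}), fs({"x", "y"})})        # models of x over {x, y}
y = fs({fs({"y"}), fs({"x", "y"})})        # models of y
x_and_y = fs({fs({"x", "y"})})             # models of x /\ y

join = make_filtered(pairwise_join, ordered_case)
r1 = join(x_and_y, x)    # caught by the cheap case
r2 = join(x, y)          # falls through to the expensive operation
print(join.stats)
```

The recorded statistics indicate whether the special case fires often enough to repay the cost of testing for it.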



3.2 Frequency Analysis for Def 



In order to balance the frequency of abstract operations against their cost, an
existing Def analyser was instrumented to count the number of calls to the
various abstract operations. The analyser used for this is based on Armstrong
and Schachte's BDD-based domain operations for Pos and Sharing (Armstrong et
al., 1998; Schachte, 1999). Using the domain operations provided for these
domains, a Def analyser can easily be derived. This analyser is coded in Prolog
as a simple meta-interpreter that uses induced magic-sets (Codish, 1999a) and
eager evaluation (Wunderwald, 1995) to perform goal-dependent bottom-up
evaluation and call the C implemented domain operations.

Induced magic is a refinement of the magic set transformation, avoiding much
of the re-computation that arises because of the repetition of literals in the
bodies of magicked clauses (Codish, 1999a). Eager evaluation (Wunderwald, 1995)
is a fixpoint iteration strategy which proceeds as follows: whenever an atom is
updated with a new (weaker) abstraction, a recursive procedure is invoked to
ensure that every clause that has that atom in its body is re-evaluated. An
advantage of induced magic is that it can be coded straightforwardly in Prolog.

Table 1 gives a breakdown of the relative frequency (in percentages) of the
calls to each abstract operation in the BDD-based Def analysis of eight large
programs. Meet, join, equiv, project and rename are the obvious Boolean
operations. Join (diff) counts those calls to a join $f_1 \dot\vee f_2$ where
$f_1 \dot\vee f_2 \neq f_1$ and $f_1 \dot\vee f_2 \neq f_2$ (this will be useful
in section 4). Total details the total number of calls to these domain
operations.



file         rubik  chat_parser  sim_v5-2  peval  aircraft  essln  chat_80  aqua_c
meet         30.9   31.6         35.9      32.5   28.5      42.7   34.0     34.2
join         10.4   10.4         8.8       9.7    11.1      8.4    10.2     10.5
join (diff)  1.1    1.7          0.0       2.9    0.1       0.9    1.5      1.6
equiv        10.4   10.4         8.8       9.7    11.1      8.4    10.2     10.5
project      12.6   12.5         13.0      12.5   13.0      10.5   12.1     11.7
rename       34.7   33.4         33.6      32.8   36.2      29.2   32.0     31.6
total        14336  14124        5943      6275   24758     19051  45444    280485

Table 1. Frequency Analysis: BDD-based Def Analyser (Figures in %)



Observe that meet and rename are called most frequently. Join, equiv and project 
are called with a similar frequency, but less frequently than meet and rename. Note 
that it is rare for a join to differ from both its arguments. Join is always followed 
by an equivalence and this explains why the join and equiv rows coincide. Since
meet and rename are the most frequently called operations, ideally they should be 
the most lightweight. As join, equiv and project are called less frequently, a higher 
algorithmic complexity is more acceptable for these operations. 



3.3 Representations of Def 

This section reviews a number of representations of Def in terms of the
algorithmic complexity of the domain operations. The representations considered
are reduced ordered binary decision diagrams, dual Blake canonical form
(specialised for Def (Armstrong et al., 1998)) and a non-canonical definite
propositional clause representation.

ROBDD A reduced ordered binary decision diagram (ROBDD) is a rooted, directed
acyclic graph. Terminal nodes are labelled 0 or 1 and non-terminal nodes are
labelled by a variable and have edges directed towards two child nodes. ROBDDs
have the additional properties that: 1) each path from the root to a node
respects a given ordering on the variables, 2) a variable cannot occur multiply
in a path, 3) no subBDD occurs multiply. ROBDDs give a unique representation
for every Boolean function (up to variable ordering).

DBCF Dual Blake Canonical Form (DBCF) represents Def functions as conjunctions
of definite (propositional) clauses (Armstrong et al., 1998; Dart, 1991; Garcia
de la Banda et al., 1996) maintained in a canonical (orthogonal) form that makes
explicit transitive variable dependencies and uses a reduced monotonic body
form. For example, the function $(x \leftarrow y) \wedge (y \leftarrow z)$ is
represented as $(x \leftarrow (y \vee z)) \wedge (y \leftarrow z)$. Again, DBCF
gives a unique representation for every Def function (up to variable ordering).

Non-canonical The non-canonical clausal representation expresses Def functions
as conjunctions of propositional clauses, but does not maintain a canonical
form. This does not give a unique representation.

Table 2 details the complexities of the domain operations for Def for the three 
representations. Notice that the complexities are in terms of the size of the repre- 
sentations and that these are all potentially exponential in the number of variables. 
Also, observe that since DBCF maintains transitive dependencies, whereas the non- 
canonical representation does not, the DBCF of a Def function has the potential 
to be considerably larger than the non-canonical representation. As ROBDDs are 
represented in a fundamentally different way, their size cannot be directly compared 
with clausal representations. 

Both ROBDDs and DBCF are maintained in a canonical form. Canonical forms 
reduce the cost of operations such as equivalence checking and projection by fac- 
toring out search. However, canonical forms need to be maintained and this main- 
tenance has an associated cost in meet and join. That is, ROBDDs and DBCF buy 
low complexity equivalence checking and projection at the cost of higher complexity 
meet and join. 
Representation   meet      join          equiv     rename    project
ROBDD            $O(N^2)$  $O(N^2)$      $O(1)$    $O(N^2)$  $O(N^2)$
DBCF             $O(N^4)$  $O(2^{2N})$   $O(N)$    $O(N)$    $O(N)$
Non-canonical    $O(1)$    $O(2^{2N})$   $O(N^2)$  $O(N)$    $O(2^N)$

Table 2. Complexity of Def Operations for Various Representations (where N is
the size of the representation - number of nodes/variable occurrences).

As discussed previously, the lowest cost operations should be those that are
most frequently called. Table 1 shows that for Def-based groundness analysis,
meet and renaming are called significantly more often than the other
operations. Hence these should be the most lightweight. This suggests that the
non-canonical representation is better suited to Def-based goal-dependent
groundness analysis than ROBDDs and DBCF. The following sections will detail
the non-canonical representation.



3.4 GEP Representation 

This section outlines how the non-canonical representation is used in an analysis 
for call and answer patterns. Implementing call and answer patterns with a non- 
ground representation enables the non-canonical representation to be factorised at 
little overhead. 

A call (or answer) pattern is a pair $\langle a, f\rangle$ where $a$ is an atom
and $f \in Def$. Normally the arguments of $a$ are distinct variables. The
formula $f$ is a conjunction (list) of propositional clauses. In a non-ground
representation the arguments of $a$ can be instantiated and aliased to express
simple dependency information (Heaton et al., 2000). For example, if
$a = p(x_1, \ldots, x_5)$, then the atom $p(x_1, true, x_1, x_4, true)$
represents $a$ coupled with the formula $(x_1 \leftrightarrow x_3) \wedge x_2
\wedge x_5$. This enables the abstraction $\langle p(x_1, \ldots, x_5),
(x_1 \leftrightarrow x_3) \wedge x_2 \wedge x_5 \wedge (x_4 \rightarrow
x_1)\rangle$ to be collapsed to $\langle p(x_1, true, x_1, x_4, true), x_4
\rightarrow x_1\rangle$. This encoding leads to a more compact representation
and is similar to the GER factorisation of ROBDDs proposed by Bagnara and
Schachte (Bagnara & Schachte, 1999). The representation of call and answer
patterns described above is called GEP (groundness, equivalences and
propositional clauses) where the atom captures the first two properties and the
formula the latter.
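The collapse in the example above can be sketched in a few lines. The Python illustration below is hypothetical (the function `factorise` and its argument encoding are not taken from the analyser): ground variables are replaced by `true` in the atom, equivalent variables are aliased to a single representative, and the remaining clauses are rewritten accordingly. For simplicity it assumes that the residual clauses mention no ground variables.

```python
def factorise(args, ground, equivalences, clauses):
    """Fold groundness and equivalences into the atom's argument list.

    args:         list of distinct argument variables of the atom
    ground:       set of variables known to be ground
    equivalences: list of pairs (u, v) meaning u <-> v
    clauses:      list of (head, body) pairs, body a list of variables
    """
    rep = {v: v for v in args}

    def find(v):
        # Follow representative links to the class representative.
        while rep[v] != v:
            v = rep[v]
        return v

    for a, b in equivalences:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the variable occurring first in the atom as representative.
            first, second = sorted((ra, rb), key=args.index)
            rep[second] = first

    ground_reps = {find(v) for v in ground}

    def image(v):
        r = find(v)
        return "true" if r in ground_reps else r

    atom_args = [image(v) for v in args]
    residual = [(image(h), [image(b) for b in body]) for h, body in clauses]
    return atom_args, residual

atom_args, residual = factorise(
    ["x1", "x2", "x3", "x4", "x5"],
    ground={"x2", "x5"},
    equivalences=[("x1", "x3")],
    clauses=[("x1", ["x4"])],          # x4 -> x1
)
print(atom_args, residual)
```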

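Formally, let $GEP = \{\langle p(t_1, \ldots, t_n), f\rangle \mid p \in \Pi,\
t_i \in V \cup \{true\},\ f \in (Def \setminus GE) \cup \{true\}\}$. Define
$\models$ by $\langle p(\vec{a}_1), f_1\rangle \models \langle p(\vec{a}_2),
f_2\rangle$ iff $\overline{\exists}\vec{x}.((\vec{a}_1 \leftrightarrow \vec{x})
\wedge f_1) \models \overline{\exists}\vec{x}.((\vec{a}_2 \leftrightarrow
\vec{x}) \wedge f_2)$ and $var(\vec{x}) \cap (var(\vec{a}_1) \cup var(\vec{a}_2)
\cup var(f_1) \cup var(f_2)) = \emptyset$. Then $\langle GEP, \models\rangle$ is
a preorder. The preorder induces the equivalence relation $\equiv$ defined by
$\langle p(\vec{a}_1), f_1\rangle \equiv \langle p(\vec{a}_2), f_2\rangle$ iff
$\langle p(\vec{a}_1), f_1\rangle \models \langle p(\vec{a}_2), f_2\rangle$ and
$\langle p(\vec{a}_2), f_2\rangle \models \langle p(\vec{a}_1), f_1\rangle$. Let
$GEP_\equiv$ denote $GEP$ quotiented by the equivalence. Define $\wedge :
GEP_\equiv \times GEP_\equiv \to GEP_\equiv$ by $[\langle a_1, f_1\rangle]_\equiv
\wedge [\langle a_2, f_2\rangle]_\equiv = [\langle \theta(a_1), \theta(f_1)
\wedge \theta(f_2)\rangle]_\equiv$, where $\theta \in mgu(a_1, a_2)$. Then
$\langle GEP_\equiv, \models, \wedge\rangle$ is a finite lattice.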

The meet of the pairs $\langle p(\vec{a}_1), f_1\rangle$ and $\langle
p(\vec{a}_2), f_2\rangle$ can be computed by unifying $\vec{a}_1$ and
$\vec{a}_2$ and concatenating $f_1$ and $f_2$. The unification is nearly linear
in the arity of $p$ (using rational tree unification (Jaffar, 1984)) and
concatenation is constant-time (using difference lists). Since the arguments
$\vec{a}_1$ and $\vec{a}_2$ are necessarily distinct, the analyser would unify
$\vec{a}_1$ and $\vec{a}_2$ even in a non-factorised representation, hence no
extra overhead is incurred. The objects that require renaming are formulae and
call (answer) pattern GEP pairs. If a dynamic database is used to store the
pairs (Hermenegildo et al., 1992), then renaming is automatically applied each
time a pair is looked-up in the database. Formulae can be renamed with a single
call to the Prolog builtin copy_term. Renaming is therefore linear.

The GEP factorisation defined above is true, that is, all the GE dependencies
are factored into the atom. An alternative definition would be $GEP = \{\langle
p(t_1, \ldots, t_n), f\rangle \mid p \in \Pi,\ t_i \in V \cup \{true\},\ f \in
Def\}$. Here the factorisation is not necessarily true, in the sense that GE
dependencies may exist in the P component, e.g. $\langle p(x, x, true),
true\rangle$ may also be correctly expressed as $\langle p(u, v, w), (u
\leftrightarrow v) \wedge w\rangle$. A non-true factorisation may be
advantageous when it comes to implementing the domain and henceforth GEP will
refer to the non-true factorisation version unless stated otherwise. The P
component may contain redundant (indeed, repeated) clauses and these may impact
adversely on performance. In order to avert unconstrained growth of P, a
redundancy removal step may be applied to P at a convenient point (via
entailment checking). Since the non-canonical formulae do not need to be
maintained in a canonical form and since the factorisation is not necessarily
true, the representation is flexible in that it can be maintained on demand,
that is, the implementor can choose to move dependencies from P into GE at
exactly those points in the analysis where true factorisation gives a
performance benefit.
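Redundancy removal via entailment checking can be sketched with naive forward chaining over definite clauses. The Python code below is an illustration under stated assumptions, not the delay-declaration implementation used in the analyser: a clause is encoded as a pair (head, body) with the body a frozenset of variables, and the function names are hypothetical. A clause is entailed exactly when its head is reachable by forward chaining from its body.

```python
def closure(clauses, facts):
    """Forward chaining: least set of variables containing facts and closed under clauses."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in known and body <= known:
                known.add(head)
                changed = True
    return known

def entails_clause(clauses, clause):
    """Does the conjunction of clauses entail the definite clause head <- body?"""
    head, body = clause
    return head in closure(clauses, body)

def entails(f1, f2):
    """f1 entails f2 iff f1 entails every clause of f2."""
    return all(entails_clause(f1, c) for c in f2)

def remove_redundant(clauses):
    """Drop each clause that is entailed by the remaining ones."""
    kept = list(clauses)
    i = 0
    while i < len(kept):
        rest = kept[:i] + kept[i + 1:]
        if entails_clause(rest, kept[i]):
            kept = rest
        else:
            i += 1
    return kept

fs = frozenset
f = [("x", fs({"u"})), ("u", fs({"y"})), ("x", fs({"y"})), ("x", fs({"u"}))]
print(remove_redundant(f))
```

In the example the repeated clause for x and the transitive consequence x <- y are both discarded, leaving only u <- y and x <- u.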



4 Filtering and Algorithms 

The non-canonical representation has high cost join and projection algorithms. 
Therefore it is sensible to focus on improving the efficiency of these operations. 
This is accomplished through filtering following the strategy described in section 
3.1. This section presents a new approach to calculating join and describes the use 
of entailment checking as a filter in the join algorithm. It also describes a filtering 
method for projection. 



4.1 Join 

This section describes a new approach to calculating join, inspired by a convex
hull algorithm for polyhedra used in disjunctive constraint solving (De Backer &
Beringer, 1993). The new join algorithm is first described for formulae and is
then lifted to the GEP representation.



4.1.1 Join for Formulae

Calculating join in Def is not straightforward. It is not enough to take the
join of each possible pair of clauses and conjoin them - transitive
dependencies also need to be taken into account. This is illustrated by the
following example (adapted from (Armstrong et al., 1998)).
Example 4

Put $f_1 = (x \leftarrow u) \wedge (u \leftarrow y)$ and $f_2 = (x \leftarrow v)
\wedge (v \leftarrow y)$. Then $f_1 \dot\vee f_2 = (x \leftarrow (u \wedge v))
\wedge (x \leftarrow y)$. The clause $(x \leftarrow (u \wedge v))$ comes from
$(x \leftarrow u) \dot\vee (x \leftarrow v)$, but the clause $x \leftarrow y$ is
not the result of the join of any pair of clauses in $f_1$, $f_2$. It arises
since $f_1 \models x \leftarrow y$ and $f_2 \models x \leftarrow y$, that is,
from clauses which appear in transitive closure.

One way in which to address the problem of ensuring that the transitive
dependencies are captured is to make them explicit in the representation (this
idea is captured in the orthogonal form requirement of (Armstrong et al.,
1998)). However, this leads to redundancy in the formula which ideally should
be avoided.

It is insightful to consider $\dot\vee$ as an operation on the models of $f_1$
and $f_2$. Since both $model_X(f_i)$ are closed under intersection, $\dot\vee$
essentially needs to extend $model_X(f_1) \cup model_X(f_2)$ with new models
$M_1 \cap M_2$ where $M_i \in model_X(f_i)$ to compute $f_1 \dot\vee f_2$.

The following definition expresses this observation and leads to a new way of
computing $\dot\vee$ in terms of meet, renaming and projection, that does not
require formulae to be first put into orthogonal form.

Definition 3 

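The map $\curlyvee : Bool_X^2 \to Bool_X$ is defined by: $f_1 \curlyvee f_2 =
\overline{\exists} Y.\, f_1 \veebar f_2$ where $Y = var(f_1) \cup var(f_2)$ and
$f_1 \veebar f_2 = \rho_1(f_1) \wedge \rho_2(f_2) \wedge \wedge_{y \in Y}\, y
\leftrightarrow (\rho_1(y) \wedge \rho_2(y))$.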

The following example illustrates the Y operator. 

Example 5 

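Let $f_1 = (x \leftarrow u) \wedge (u \leftarrow y)$, $f_2 = (x \leftarrow v)
\wedge (v \leftarrow y)$. Then $Y = \{u, v, x, y\}$. The following substitutions
rename the functions apart, $\rho_1 = \{u \mapsto u', v \mapsto v', x \mapsto
x', y \mapsto y'\}$, $\rho_2 = \{u \mapsto u'', v \mapsto v'', x \mapsto x'', y
\mapsto y''\}$. Using Definition 3, $f_1 \veebar f_2 = (x' \leftarrow u') \wedge
(u' \leftarrow y') \wedge (x'' \leftarrow v'') \wedge (v'' \leftarrow y'')
\wedge u \leftrightarrow (u' \wedge u'') \wedge v \leftrightarrow (v' \wedge
v'') \wedge x \leftrightarrow (x' \wedge x'') \wedge y \leftrightarrow (y'
\wedge y'')$. Projection onto $Y$ gives $f_1 \curlyvee f_2 =
\overline{\exists}\{u, v, x, y\}.\, f_1 \veebar f_2 = (x \leftarrow (u \wedge
v)) \wedge (x \leftarrow y)$.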

Note that $\curlyvee$ operates on $Bool_X$ rather than $Def_X$. This is required
for the downward closure operator in section 5.3. Lemma 1 expresses a key
relationship between $\curlyvee$ and the models of $f_1$ and $f_2$.

Lemma 1 

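Let $f_1, f_2 \in Bool_X$. $M \in model_X(f_1 \curlyvee f_2)$ if and only if
there exists $M_1 \in model_X(f_1)$ and $M_2 \in model_X(f_2)$ such that
$M = M_1 \cap M_2$.

Proof

Put $X' = X \cup \rho_1(X) \cup \rho_2(X)$. Let $M \in model_X(f_1 \curlyvee
f_2)$. There exists $M \subseteq M' \subseteq X'$ such that $M' \in
model_{X'}(f_1 \veebar f_2)$. Let $M_i = \rho_i^{-1}(M' \cap \rho_i(Y))$, for
$i \in \{1, 2\}$. Thus $M_i \in model_X(f_i)$ for $i \in \{1, 2\}$. Observe that
$M \subseteq M_1 \cap M_2$ since $f_1 \veebar f_2 \models y \rightarrow
(\rho_1(y) \wedge \rho_2(y))$. Also observe that $M_1 \cap M_2 \subseteq M$
since $f_1 \veebar f_2 \models (\rho_1(y) \wedge \rho_2(y)) \rightarrow y$. Thus
$M = M_1 \cap M_2$, as required.

Let $M_i \in model_X(f_i)$ for $i \in \{1, 2\}$ and put $M = M_1 \cap M_2$ and
$M' = M \cup \rho_1(M_1) \cup \rho_2(M_2)$. Observe $M' \in model_{X'}(f_1
\veebar f_2)$ so that $M \in model_X(f_1 \curlyvee f_2)$.

□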
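Lemma 1 suggests a direct, if exponential, way of checking Example 5 on model sets: form every pairwise intersection of a model of $f_1$ with a model of $f_2$ and compare the result with the models of the expected join. The Python sketch below does this by brute force; the helper names are hypothetical and the code only illustrates the lemma, not the meet, rename and project formulation of Definition 3.

```python
from itertools import combinations

def models(f, X):
    """Enumerate model_X(f) for a predicate f over an assignment dict."""
    X = sorted(X)
    return {frozenset(M)
            for k in range(len(X) + 1)
            for M in combinations(X, k)
            if f({v: (v in M) for v in X})}

def pairwise_intersections(ms1, ms2):
    """Model set characterised by Lemma 1: all M1 & M2 with Mi a model of fi."""
    return {m1 & m2 for m1 in ms1 for m2 in ms2}

def clause(a, head, *body):
    # Truth value of the definite clause head <- body under assignment a.
    return a[head] or not all(a[b] for b in body)

X = {"u", "v", "x", "y"}
f1 = lambda a: clause(a, "x", "u") and clause(a, "u", "y")
f2 = lambda a: clause(a, "x", "v") and clause(a, "v", "y")
expected = lambda a: clause(a, "x", "u", "v") and clause(a, "x", "y")

result = pairwise_intersections(models(f1, X), models(f2, X))
print(result == models(expected, X))
```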
From Lemma 1 flows the following corollary and also the useful result that
$\curlyvee$ is monotonic.

Corollary 1

Let $f \in Pos_X$. Then $f = f \curlyvee f$ if and only if $f \in Def_X$.

Lemma 2

$\curlyvee$ is monotonic, that is, $f_1 \curlyvee f_2 \models f_1' \curlyvee
f_2'$ whenever $f_1 \models f_1'$ and $f_2 \models f_2'$.

Proof

Let $M \in model_X(f_1 \curlyvee f_2)$. By Lemma 1, there exist $M_i \in
model_X(f_i)$ such that $M = M_1 \cap M_2$. Since $f_i \models f_i'$, $M_i \in
model_X(f_i')$ and hence, by Lemma 1, $M \in model_X(f_1' \curlyvee f_2')$. □

The following proposition states that $\curlyvee$ coincides with $\dot{\vee}$ on $\mathit{Def}_X$. This gives a
simple algorithm for calculating $\dot{\vee}$ that does not depend on the representation of a
formula.

Proposition 1 

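Let $f_1, f_2 \in \mathit{Def}_X$. Then $f_1 \curlyvee f_2 = f_1 \dot{\vee} f_2$.

Proof

Since $X \models f_2$ it follows by monotonicity that $f_1 = f_1 \curlyvee X \models f_1 \curlyvee f_2$ and similarly
$f_2 \models f_1 \curlyvee f_2$. Hence $f_1 \dot{\vee} f_2 \models f_1 \curlyvee f_2$ by the definition of $\dot{\vee}$.

Now let $M \in \mathit{model}_X(f_1 \curlyvee f_2)$. By lemma 1, there exist $M_i \in \mathit{model}_X(f_i)$ such
that $M = M_1 \cap M_2 \in \mathit{model}_X(f_1 \dot{\vee} f_2)$. Hence $f_1 \curlyvee f_2 \models f_1 \dot{\vee} f_2$. □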
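
Lemma 1 and Proposition 1 can be checked on Example 5 by brute force. The following is only an illustrative sketch, not the representation used by the analyser: it assumes a ground encoding in which a formula is a list of Head-Body pairs (Body a sorted list of atoms) and a model is a sorted list of atoms. The predicate names models/3, sub_set/2, satisfies/2, curly_join/3 and example5/0 are hypothetical.

```prolog
:- use_module(library(lists)).
:- use_module(library(ordsets)).

% models(+Vars, +Clauses, -Models): Models is the sorted list of all subsets M
% of the sorted list Vars that satisfy every clause Head-Body in Clauses.
models(Vars, Clauses, Models) :-
    findall(M, ( sub_set(Vars, M), satisfies(M, Clauses) ), Ms),
    sort(Ms, Models).

% sub_set(+Set, -Sub): enumerate the sublists (subsets) of Set on backtracking.
sub_set([], []).
sub_set([X|Xs], [X|Ys]) :- sub_set(Xs, Ys).
sub_set([_|Xs], Ys) :- sub_set(Xs, Ys).

% satisfies(+M, +Clauses): no clause has its body inside M and its head outside M.
satisfies(M, Clauses) :-
    \+ ( member(H-B, Clauses), ord_subset(B, M), \+ memberchk(H, M) ).

% curly_join(+Ms1, +Ms2, -Ms): all pairwise intersections, following Lemma 1.
curly_join(Ms1, Ms2, Ms) :-
    findall(M,
            ( member(M1, Ms1), member(M2, Ms2), ord_intersection(M1, M2, M) ),
            Ms0),
    sort(Ms0, Ms).

% Example 5: the pairwise intersections of the models of f1 and f2 are
% exactly the models of (x <- u /\ v) /\ (x <- y).
example5 :-
    Vars = [u, v, x, y],
    models(Vars, [x-[u], u-[y]], Ms1),
    models(Vars, [x-[v], v-[y]], Ms2),
    curly_join(Ms1, Ms2, Ms),
    models(Vars, [x-[u, v], x-[y]], Expected),
    Ms == Expected.
```

The query `?- example5.` succeeds. Enumerating models is exponential in the number of variables, so this is a way of checking small examples rather than a practical join.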



4.1 Join for GEP

Join, $\dot{\vee} : \mathit{GEP}_{\equiv} \times \mathit{GEP}_{\equiv} \to \mathit{GEP}_{\equiv}$, in the GEP representation can be defined in
terms of $\wedge$ and $\models$ in the usual way, i.e.



In practice quotienting manifests itself through the dynamic database. Each time
a pattern is read from the database it is renamed. Join is lifted to quotients by
reformulating GEP pairs as follows: $\langle p(\bar{a}_1), f_1 \rangle$ becomes $\langle p(\bar{a}), (\bar{a} \leftrightarrow \bar{a}_1) \wedge f_1 \rangle$ where
$p(\bar{a}) = \mathit{msg}(p(\bar{a}_1), p(\bar{a}_2))$. $p(\bar{a})$ is computed using Plotkin's anti-unification
algorithm in $O(N \log(N))$ time, where $N$ is the arity of $p$ (Plotkin, 1970). The following
lemma formalises this lifting of the join algorithm to the GEP representation.

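Lemma 3

$[\langle p(\bar{t}_1), f_1 \rangle]_{\equiv} \dot{\vee} [\langle p(\bar{t}_2), f_2 \rangle]_{\equiv} = [\langle p(\bar{t}), (f_1 \wedge (\bar{t}_1 \leftrightarrow \bar{t})) \curlyvee (f_2 \wedge (\bar{t}_2 \leftrightarrow \bar{t})) \rangle]_{\equiv}$ where
$\bar{t} \in \mathit{msg}(\bar{t}_1, \bar{t}_2)$.

Proof

$\ldots = [\langle p(\bar{t}), \bigwedge \{ f' \in \mathit{Def} \mid (\bar{t}_1 \leftrightarrow \bar{t}) \wedge f_1 \models f', (\bar{t}_2 \leftrightarrow \bar{t}) \wedge f_2 \models f' \} \rangle]_{\equiv}$
$= [\langle p(\bar{t}), (f_1 \wedge (\bar{t}_1 \leftrightarrow \bar{t})) \curlyvee (f_2 \wedge (\bar{t}_2 \leftrightarrow \bar{t})) \rangle]_{\equiv}$

The first equality holds by the definition of $\equiv$ in $\mathit{GEP}_{\equiv}$, the second by the definition
of join in $\mathit{GEP}_{\equiv}$, the third by the definition of $\models$ in $\mathit{GEP}_{\equiv}$, the fourth by the ...

□



4.2 Filtering Join using Entailment Checking

In section 3.3 it was observed that some high complexity domain operations have
special cases where the operation can be calculated using a lower complexity
algorithm. Join for Def in the non-canonical GEP representation is one such operation.
Specifically, $\dot{\vee}$ is exponential (see Table 2); however, if $f_1 \models f_2$, then $f_1 \dot{\vee} f_2 = f_2$.
Entailment checking is quadratic in the number of variable occurrences (using a
forward chaining algorithm), hence by using this test, join can be refined. Table 1
shows that the majority of calls to join will be caught by the cheaper entailment
checking case. The following proposition explains how this filtering is lifted to the
GEP representation. Observe that this proposition has three cases. The third case
is when the entailment check fails. The first case is when entailment checking
reduces to a lightweight match on the GE component followed by an entailment check
on the P component. The second case is more expensive, requiring a most specific
generalisation to be computed as well as an entailment check on more complicated
formulae. In the context of the analyser, the pair $[\langle p(\bar{t}_2), f_2 \rangle]_{\equiv}$ corresponds to an
abstraction in the database and these abstractions have the property that the
variables in the P component are contained in those of the GE component. This is
not necessarily the case for $[\langle p(\bar{t}_1), f_1 \rangle]_{\equiv}$, since in the induced magic framework $f_1$
represents the state of the variables of the clause to the left of the call to $p(\bar{t}_1)$.
Variable disjointness follows since renaming automatically occurs every time a fact
is read from the dynamic database.

Proposition 2

Suppose $\mathit{var}(f_2) \subseteq \mathit{var}(p(\bar{t}_2))$ and $\mathit{var}(\langle p(\bar{t}_1), f_1 \rangle) \cap \mathit{var}(\langle p(\bar{t}_2), f_2 \rangle) = \emptyset$. Then,

$$[\langle p(\bar{t}_1), f_1 \rangle]_{\equiv} \dot{\vee} [\langle p(\bar{t}_2), f_2 \rangle]_{\equiv} =
\begin{cases}
[\langle p(\bar{t}_2), f_2 \rangle]_{\equiv} & \text{if } \theta \in \mathit{mgu}(p(\bar{t}_1), p(\bar{t}_2)),\ \theta(f_1) \models \theta(f_2) \\
[\langle p(\bar{t}_2), f_2 \rangle]_{\equiv} & \text{if } p(\bar{t}) \in \mathit{msg}(p(\bar{t}_1), p(\bar{t}_2)),\ f_1 \wedge (\bar{t}_1 \leftrightarrow \bar{t}) \models f_2 \wedge (\bar{t}_2 \leftrightarrow \bar{t}) \\
[\langle p(\bar{t}), f \rangle]_{\equiv} & \text{otherwise, where } p(\bar{t}) = \mathit{msg}(p(\bar{t}_1), p(\bar{t}_2)),\ f = (f_1 \wedge (\bar{t}_1 \leftrightarrow \bar{t})) \curlyvee (f_2 \wedge (\bar{t}_2 \leftrightarrow \bar{t}))
\end{cases}$$
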

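Proof

Case 1

$$\begin{array}{rcll}
\theta(f_1) & \models & \theta(f_2) & \\
\ldots & \models & (\theta(\bar{t}_2) \leftrightarrow \bar{x}) \wedge \theta(f_2) & \text{by assumption} \\
\ldots & \models & (\theta(\bar{t}_2) \leftrightarrow \bar{x}) \wedge \theta(f_2) & \bar{t}_1 = \theta(\bar{t}_2) = \theta(\bar{t}_1) \\
(\bar{t}_1 \leftrightarrow \bar{x}) \wedge f_1 & \models & (\theta(\bar{t}_2) \leftrightarrow \bar{x}) \wedge \theta(f_2) & \mathit{var}(f_1) \cap \mathit{var}(\bar{t}_2) = \emptyset \\
(\bar{t}_1 \leftrightarrow \bar{x}) \wedge f_1 & \models & (\bar{t}_2 \leftrightarrow \bar{x}) \wedge f_2 & \models \text{ is transitive} \\
\overline{\exists}\bar{x}.((\bar{t}_1 \leftrightarrow \bar{x}) \wedge f_1) & \models & \overline{\exists}\bar{x}.((\bar{t}_2 \leftrightarrow \bar{x}) \wedge f_2) & \overline{\exists} \text{ is monotonic} \\
{[\langle p(\bar{t}_1), f_1 \rangle]_{\equiv}} & \models & {[\langle p(\bar{t}_2), f_2 \rangle]_{\equiv}} & \text{by definition}
\end{array}$$

Case 2

$$\begin{array}{rcll}
(\bar{t}_1 \leftrightarrow \bar{t}) \wedge f_1 & \models & (\bar{t}_2 \leftrightarrow \bar{t}) \wedge f_2 & \\
(\bar{t} \leftrightarrow \bar{x}) \wedge (\bar{t}_1 \leftrightarrow \bar{t}) \wedge f_1 & \models & (\bar{t} \leftrightarrow \bar{x}) \wedge (\bar{t}_2 \leftrightarrow \bar{t}) \wedge f_2 & \\
\overline{\exists}\bar{x}.((\bar{t} \leftrightarrow \bar{x}) \wedge (\bar{t}_1 \leftrightarrow \bar{t}) \wedge f_1) & \models & \overline{\exists}\bar{x}.((\bar{t} \leftrightarrow \bar{x}) \wedge (\bar{t}_2 \leftrightarrow \bar{t}) \wedge f_2) & \overline{\exists} \text{ is monotonic} \\
\Rightarrow \overline{\exists}\bar{x}.((\bar{t}_1 \leftrightarrow \bar{x}) \wedge f_1) & \models & \overline{\exists}\bar{x}.((\bar{t}_2 \leftrightarrow \bar{x}) \wedge f_2) & \text{since } \bar{x} \text{ are fresh} \\
{[\langle p(\bar{t}_1), f_1 \rangle]_{\equiv}} & \models & {[\langle p(\bar{t}_2), f_2 \rangle]_{\equiv}} & \text{by definition}
\end{array}$$

Case 3 Immediate from lemma 3. □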



A non-ground representation allows chaining to be implemented efficiently using
block declarations. To check that $\bigwedge_{i=1}^{n} y_i \leftarrow Y_i$ entails $z \leftarrow Z$ the variables of $Z$ are
first grounded. Next, a process is created for each clause $y_i \leftarrow Y_i$ that suspends until
$Y_i$ is ground. When $Y_i$ is ground, the process resumes and grounds $y_i$. If $z$ is ground
after a single pass over the clauses, then $(\bigwedge_{i=1}^{n} y_i \leftarrow Y_i) \models z \leftarrow Z$. Suspending and
resuming a process declared by a block is constant-time (in SICStus). By calling the
check under negation, no problematic bindings or suspended processes are created.
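
The check can be sketched as follows. This is a minimal illustration under stated assumptions rather than the analyser's own code: it uses when/2 (available as a built-in or autoloaded predicate in SICStus and SWI-Prolog) as a portable stand-in for block declarations, so the constant-time suspension cost quoted above is not claimed for it. A formula is assumed to be a list of Y-Ys pairs whose propositional variables are Prolog variables; the names entails/2, ground_body/1 and suspend_all/1 are hypothetical.

```prolog
% entails(+Clauses, +Z-Zs): succeeds if the conjunction of the clauses Y-Ys
% (read y <- Ys) entails z <- Zs. The double negation discards all bindings
% and suspended goals once the check has finished.
entails(Clauses, Z-Zs) :-
    \+ \+ ( ground_body(Zs), suspend_all(Clauses), Z == true ).

% ground_body(+Vars): bind every variable of the body to true.
ground_body([]).
ground_body([true|Vs]) :- ground_body(Vs).

% suspend_all(+Clauses): one suspended goal per clause; it resumes when the
% whole body is ground and then grounds the head.
suspend_all([]).
suspend_all([Y-Ys|Cs]) :-
    when(ground(Ys), Y = true),
    suspend_all(Cs).
```

For instance, `?- entails([X-[U], U-[Y]], X-[Y]).` succeeds and leaves X, U and Y unbound, whereas `?- entails([X-[U], U-[Y]], U-[X]).` fails.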



4.3 Downward Closure

A useful spin-off of the join algorithm in section 4.1 is a result that shows how to
calculate succinctly the downward closure operator that arises in Pos-based sharing
analysis (Codish et al., 1999). Downward closure is closely related to $\curlyvee$ and, in fact,
$\curlyvee$ can be used repeatedly to compute a finite iterative sequence that converges to
$\downarrow$. This is stated in proposition 3. Finiteness follows from bounded chain length of
$\mathit{Pos}_X$.

Proposition 3

Let $f \in \mathit{Pos}_X$. Then $\downarrow f = \bigvee_{i \geq 1} f_i$ where $f_i \in \mathit{Pos}_X$ is the increasing chain given
by: $f_1 = f$ and $f_{i+1} = f_i \curlyvee f_i$.

Proof

Let $M \in \mathit{model}_X(\downarrow f)$. Thus there exist $M_j \in \mathit{model}_X(f)$ such that $M = \bigcap_{j=1}^{m} M_j$.
Observe $M_1 \cap M_2, M_3 \cap M_4, \ldots \in \mathit{model}_X(f_2)$ and therefore $M \in \mathit{model}_X(f_{\lceil \log_2(m) \rceil})$.
Since $m \leq 2^{2^n}$ where $n = |X|$ it follows that $\downarrow f \models f_{2^n}$.

Proof by induction is used for the opposite direction. Observe that $f_1 \models \downarrow f$.
Suppose $f_i \models \downarrow f$. Let $M \in \mathit{model}_X(f_{i+1})$. By lemma 1 there exist $M_1, M_2 \in \mathit{model}_X(f_i)$
such that $M = M_1 \cap M_2$. By the inductive hypothesis $M_1, M_2 \in \mathit{model}_X(\downarrow f)$ thus
$M \in \mathit{model}_X(\downarrow f)$. Hence $f_{i+1} \models \downarrow f$.

Finally, $\bigvee_{i \geq 1} f_i \in \mathit{Def}_X$ since $f_1 \in \mathit{Pos}_X$ and $\curlyvee$ is monotonic and thus
$X \in \mathit{model}_X(\bigvee_{i \geq 1} f_i)$. □

The significance of this is that it enables $\downarrow$ to be implemented straightforwardly
with standard domain operations. This saves the implementor the task of coding
another domain operation.
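
The iteration of Proposition 3 can be illustrated at the level of models. The sketch below assumes, purely for illustration, that a function is given by the list of its models (each a sorted list of atoms) and uses Lemma 1 to realise the self-join as pairwise intersection; self_join/2 and down_closure/2 are hypothetical names, not predicates of the analyser.

```prolog
:- use_module(library(lists)).
:- use_module(library(ordsets)).

% self_join(+Ms, -Ms1): Ms1 holds all pairwise intersections of models in Ms,
% which by Lemma 1 are the models of the function joined with itself.
self_join(Ms, Ms1) :-
    findall(M,
            ( member(A, Ms), member(B, Ms), ord_intersection(A, B, M) ),
            Ms0),
    sort(Ms0, Ms1).

% down_closure(+Ms0, -Closed): repeat the self-join until the set of models
% is stable; Closed is then closed under intersection.
down_closure(Ms0, Closed) :-
    sort(Ms0, Ms),
    self_join(Ms, Ms1),
    (   Ms1 == Ms
    ->  Closed = Ms
    ;   down_closure(Ms1, Closed)
    ).
```

For $f = x \vee y$ over $X = \{x, y\}$, the query `?- down_closure([[x], [x,y], [y]], C).` adds the empty model, so that C contains all four subsets of X.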

4.4 Projection

Projection is only applied to the P component of the GEP representation (since
projection is onto the variables of the GE component). Projection is another
exponential operation. Again, this operation can be filtered by recognising special
cases where the projection can be calculated with lower complexity. The projection
algorithm implemented is based on a Fourier-Motzkin style algorithm (as opposed
to a Schröder variable elimination algorithm). The algorithm is syntactic and each
of the variables to be projected out is eliminated in turn. The first two steps collect
clauses with the variable to be projected out occurring in them, the third performs
the projection by syllogising and the fourth removes redundant clauses. Suppose
that $f = \bigwedge F$, where $F$ is a set of clauses, and suppose $x$ is to be projected out of
$f$.

1. All those clauses with $x$ as their head are found, giving $H = \{x \leftarrow X_i \mid i \in I\}$,
where $I$ is a (possibly empty) index set.

2. All those clauses with $x$ in the body are found, giving $B = \{y \leftarrow Y_j \mid j \in J\}$,
where $J$ is a (possibly empty) index set and $x \in Y_j$ for each $j \in J$.

3. Let $Z_{i,j} = X_i \cup (Y_j \setminus \{x\})$. Then $N = \{y \leftarrow Z_{i,j} \mid i \in I \wedge j \in J \wedge y \notin Z_{i,j}\}$
(syllogising). Put $F' = ((F \setminus H) \setminus B) \cup N$. (Then $\exists x.f = \bigwedge F'$.)

4. A compact representation is maintained by eliminating redundant clauses
from $F'$ (compaction).

All four steps can be performed in a single pass over $f$. A final pass over $f$ retracts
clauses such as $x \leftarrow \mathit{true}$ by binding $x$ to true and also removes clause pairs such
as $y \leftarrow z$ and $z \leftarrow y$ by unifying $y$ and $z$.
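
Steps 1 to 3 can be sketched as below. This is an illustration under simplifying assumptions, not the implemented algorithm: clauses are ground Head-Body pairs with Body an ordered set of atoms, no clause has its own head in its body, the steps are written as separate traversals rather than a single pass, and step 4 (compaction) is omitted. The name project_out/3 is hypothetical.

```prolog
:- use_module(library(lists)).
:- use_module(library(ordsets)).

% project_out(+X, +F, -F1): F1 is the clause list F with variable X eliminated.
project_out(X, F, F1) :-
    % Step 1: the bodies X_i of the clauses x <- X_i.
    findall(Xi, member(X-Xi, F), H),
    % Step 2: the clauses y <- Y_j with x in Y_j.
    findall(Y-Yj, ( member(Y-Yj, F), Y \== X, memberchk(X, Yj) ), B),
    % Clauses mentioning x neither in the head nor in the body are kept.
    findall(Hd-Bd, ( member(Hd-Bd, F), Hd \== X, \+ memberchk(X, Bd) ), Rest),
    % Step 3 (syllogising): y <- X_i U (Y_j \ {x}), unless y is in the new body.
    findall(Y-Z,
            ( member(Xi, H), member(Y-Yj, B),
              ord_del_element(Yj, X, Yj1),
              ord_union(Xi, Yj1, Z),
              \+ memberchk(Y, Z) ),
            N),
    append(Rest, N, F0),
    sort(F0, F1).   % removes duplicate clauses only; no compaction is attempted
```

For example, `?- project_out(u, [x-[u], u-[y]], F).` gives `F = [x-[y]]`, that is, projecting $u$ out of $(x \leftarrow u) \wedge (u \leftarrow y)$ leaves $x \leftarrow y$.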

At each pass the cost of step 4, the compaction process, is quadratic in the size
of the formula to be compacted (since the compaction can be reduced to a linear
number of entailment checks, each of which is linear). The point of compaction is to
keep the representation small. Therefore, if the result of projecting out a variable
(prior to compaction) is smaller than the original formula, then compaction appears
to be unnecessary. Thus, step 4 is only applied when the number of clauses in the
result of the projection is strictly greater than the number of clauses in the original
formula. Notice also that in the filtered case the number of syllogisms is linear in
the number of occurrences of the variable being projected out. Table 3 details the
relative frequency with which the filtered and compaction cases are encountered.
Observe that the vast majority of cases do not require compaction. Finally notice
that join is defined in terms of projection, hence the filter for projection is inherited
by join.



file   strips  chat_parser  sim_v5-2  peval  aircraft  essln  chat_80  aqua_c
filt    100.0         99.8     100.0   97.4     100.0   99.4     99.7    96.1
elim      0.0          0.2       0.0    2.6       0.0    0.6      0.3     3.9

Table 3. Frequency Analysis of Compaction in Projection (induced magic)

Notice that filtered algorithms break up an operation into several components of 
increasing complexity. The filtered algorithm then suggests natural places at which 
to widen, i.e. the high complexity component is widened from above using a cheap
approximation. This approximation might be acceptable since the high complexity 
case will be called infrequently. For example, widening might be used to improve 
the worst case complexity of projection (and hence join) for non-canonical Def . 



5 Implementation of the Iteration Strategy 



Sections 3 and 4 are concerned with the representation of the abstract domain and
the design and implementation of domain operations. The overall efficiency of an
analyser depends not only on these operations, but also on the iteration strategy
employed within the fixpoint engine. A fixpoint engine has to trade off the complexity
of its data-structures against the degree of recomputation that these data-structures
factor out. For example, semi-naive iteration (Bancilhon & Ramakrishnan, 1986)
has very simple data-structures, but entails a degree of recomputation, whereas
PLAI (Hermenegildo et al., 2000) tracks dependencies with dynamically generated
graphs to dramatically reduce the amount of recomputation.

Fixpoint engines with dependency tracking which have been applied to logic
programming analyses include: PLAI (Hermenegildo et al., 2000), GAIA (Le Charlier
& Van Hentenryck, 1994), the CLP($\mathcal{R}$) engine (Kelly et al., 1998) and GENA (Fecht
& Seidl, 1996; Fecht, 1997; Fecht & Seidl, 1999). An alternative to on-the-fly
dependency tracking is to use semi-naive iteration driven by a redo worklist detailing
which call and answer patterns need to be re-evaluated and (possibly) in which
order. One instance of this is induced magic (Codish, 1999a) under eager evaluation
(Wunderwald, 1995), which factors out much of the recomputation that arises
through magic transformation. Other instances use knowledge of the dependencies
to help order the redo list and thereby reduce unnecessary computation - this is
typically done by statically calculating SCCs (Gallagher & de Waal, 1994), possibly
recursively (Bourdoncle, 1993), on the call graph or on the call graph of the magic
program.

The benefit of reduced recomputation is dependent upon the cost of the abstract
domain operations. Therefore the sophistication of the iteration strategies of
engines such as PLAI and GENA is of most value when the domain operations are
complex. The present paper has designed its analysis so that heavyweight domain
operations are infrequently called, hence an iteration strategy employing simpler
data-structures, but possibly introducing extra computation, is worthy of
consideration. The analyser described in (Howe & King, 2000) used induced magic under
eager evaluation. The current analyser builds on this work by adopting tactics
inspired by PLAI, GAIA and GENA into the induced magic framework. Importantly
these tactics require no extra data-structures and little computational effort.
Experimental results suggest that this choice of iteration strategy is well suited to
Def-based groundness analysis.



5.1 Ordered Induced Magic 



Induced magic was introduced in (Codish, 1999a), where a meta-interpreter for
semi-naive, goal-dependent, bottom-up evaluation is presented. The analyser
described in (Howe & King, 2000) implements a variant of this scheme using eager
evaluation. In that paper, eager evaluation was implemented without an explicit
redo list as follows: each time a new call or answer pattern is generated, the
meta-interpreter invokes a predicate, solve, which re-evaluates the appropriate clauses.
The re-evaluation of a clause may in turn generate new calls to solve so that one
call may start before another finishes. The status of these calls is maintained on the
stack, which simulates a redo list. Henceforth, this strategy is referred to as eager
induced magic.

As noted by other authors, simple optimisations can significantly impact on
performance. In particular, as noted in (Hermenegildo et al., 2000), evaluations
resulting from new calls should be performed before those resulting from new answers,
and a call to solve for one rule should finish before another call to solve for
another rule starts. These optimisations cannot be integrated with stack based eager
evaluation because they rely on reordering the calls to solve. Hence a redo list is
reintroduced in order to make these optimisations.

The meta-interpreter listed in Fig. 2 illustrates how a redo list can be integrated
with induced magic. Four of the predicates are represented as atoms in the
dynamic database: redo/2, the redo list; fact/4, the call and answer patterns, where
propositional formulae are represented as difference lists - specifically, the fourth
argument is an open list with the third argument being its tail; head_to_clause/2,
which represents the head and body for each clause; atom_to_clause/4, which
represents the clauses with a given atom in the body. Before invoking oim_solve/0, a
call to cond_assert/3 is required. This has the effect of adding the top-level call to
the fact/4 database and adding the call pattern to the redo/2 database, thereby
initialising the fixpoint calculation. Evaluation is driven by the redo list. If the
redo list contains call patterns, then the first (most recently introduced) is removed
and call_solve/1 is invoked. If the redo list contains only answer patterns, then
the first is removed and control is passed to answ_solve/1. The meta-interpreter
terminates (with failure) when the redo list is empty.

The predicate call_solve/1 re-evaluates those clauses whose heads match a
new call pattern. It first looks up a body for a clause with a given head followed
by the current call pattern for head, then solves the body in induced magic fashion
with solve_right/3. If cond_assert/3 is called with a call (answer) pattern that
does not entail the call (answer) pattern in fact/4, then it succeeds, updating
fact/4 with the join of the call (answer) patterns. In this event, the new call
(answer) pattern is added to the beginning of the redo/2 database. The predicate
answ_solve/1 re-evaluates those clauses containing a body atom which matches a
new answer pattern. It looks up a clause with a body that contains a given atom,
solves the body to the left of the atom and then to the right of the atom. If a new
call pattern is encountered in solve_right/4, then the evaluation of the clause is
aborted, as the new call may give a new answer for this body atom. In this situation,
calculating an answer for the head with the old body answer will result in an answer
that needs to be re-calculated. To ensure that the clause is re-evaluated, an answer
for the body atom is put in the redo list by redo_assert/2. This iteration strategy
is referred to as ordered induced magic.
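
Fig. 2 does not list cond_assert/3 or redo_assert/2. The behaviour described above can be sketched over a deliberately simplified setting. Everything in this sketch is an assumption made for illustration: patterns are drawn from a toy domain (the ordered set of argument positions known to be ground) rather than Def, the database is fact/3 rather than the paper's fact/4 with difference lists, and redo_assert/2 is assumed to avoid duplicate entries.

```prolog
:- use_module(library(ordsets)).
:- dynamic fact/3, redo/2.

% Toy domain (assumption): P1 entails P2 if every position ground in P2 is
% also ground in P1; the join keeps the positions ground in both patterns.
entails(P1, P2) :- ord_subset(P2, P1).
join(P1, P2, P) :- ord_intersection(P1, P2, P).

% cond_assert(+Kind, +Name, +New): succeeds only if New does not entail the
% stored pattern; the stored pattern is then replaced by the join and the
% pattern is scheduled for re-evaluation.
cond_assert(Kind, Name, New) :-
    (   fact(Kind, Name, Old)
    ->  \+ entails(New, Old),
        join(New, Old, Joined),
        retract(fact(Kind, Name, Old)),
        assertz(fact(Kind, Name, Joined))
    ;   assertz(fact(Kind, Name, New))
    ),
    redo_assert(Kind, Name).

% redo_assert(+Kind, +Name): add an entry to the front of the redo list.
redo_assert(Kind, Name) :-
    (   redo(Kind, Name)
    ->  true
    ;   asserta(redo(Kind, Name))
    ).
```

With an empty database, `cond_assert(call, p, [1,2])` succeeds, a second identical call fails (nothing new is learnt), and `cond_assert(call, p, [1])` succeeds and weakens the stored pattern to `[1]`.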



5.2 SCC-based Strategies 

In order to assess the suitability of ordered induced magic as a fixpoint strategy
for Def-based groundness analysis, it has been compared with a variety of popular
SCC-based methods. The fixpoint engine can be driven either by considering the
top-level SCCs (Gallagher & de Waal, 1994) or by considering the recursive nesting
of SCCs, for example (Bourdoncle, 1993). The SCCs can be statically calculated
either on the call graph of the magicked program or on the call graph of the original
program.

SCCs for the call graph of the magicked program (in topological order) are
calculated using Tarjan's algorithm (Tarjan, 1972). The fixpoint calculation then
proceeds bottom-up, stabilising on the (call and answer) predicates in each SCC in
topological order. If an SCC contains a single, non-recursive, (call or answer)
predicate, then the predicate must stabilise immediately, hence no fixpoint check is
needed. This strategy is henceforth referred to as SCC magic.

A more sophisticated SCC-based tactic is to calculate SCCs within an SCC, as
suggested by Bourdoncle (Bourdoncle, 1993). The 'recursive strategy' described by
Bourdoncle recursively applies Tarjan's algorithm to each non-trivial SCC having
removed an appropriate node (the head node) and corresponding edges. The fixpoint
calculation proceeds bottom-up, stabilising on the (call and answer) predicates in
each component recursively. The fixpoint check need only be made at the head
node. This strategy has the potential for reaching a fixpoint in a particularly small
number of updates. This strategy is henceforth referred to as Bourdoncle magic.
Since both SCC magic and Bourdoncle magic work on the call graph of the
magic program, they cannot be combined with induced magic; the ordering of the
re-evaluations conflicts. Calculating SCCs on the call graph of the original program
may be combined with (ordered) induced magic. The order in which the calls are
encountered is determined by the top-down left-to-right execution of the program
and the evaluation of a call may add new answers to the redo list. SCCs can be
used to order new answers as they are added to the redo list. This strategy is
henceforth referred to as SCC induced magic. However, since calls are re-evaluated
in preference to answers, the order of answers in the redo list is largely determined
by the order of the calls. Consequently, SCCs should have a negligible effect on
performance.

% Driver: call patterns on the redo list are processed before answer patterns.
oim_solve :-
    retract(redo(call, Atom)), !, (call_solve(Atom) ; oim_solve).
oim_solve :-
    retract(redo(answ, Atom)), !, (answ_solve(Atom) ; oim_solve).

% Re-evaluate the clauses whose head matches a new call pattern.
call_solve(Head) :-
    head_to_clause(Head, Body), fact(call, Head, [], Form1),
    solve_right(Body, Form1, Form2), cond_assert(answ, Head, Form2), fail.

% Re-evaluate the clauses with a body atom matching a new answer pattern.
answ_solve(Atom) :-
    atom_to_clause(Atom, Head, Left, Right),
    fact(call, Head, [], Form1), fact(answ, Atom, Form1, Form2),
    solve_left(Left, Form2, Form3), solve_right(Right, Form3, Form4),
    cond_assert(answ, Head, Form4), fail.

solve_left([], Form, Form).
solve_left([Atom | Atoms], Form1, Form3) :-
    fact(answ, Atom, Form1, Form2), solve_left(Atoms, Form2, Form3).

solve_right([], Form, Form).
solve_right([Atom | Atoms], Form1, Form2) :-
    solve_right(Atom, Atoms, Form1, Form2).

% A new call pattern aborts the clause and schedules the body atom's answer.
solve_right(Atom, _, Form, _) :-
    cond_assert(call, Atom, Form), !, redo_assert(answ, Atom), fail.
solve_right(Atom, Atoms, Form1, Form3) :-
    fact(answ, Atom, Form1, Form2), solve_right(Atoms, Form2, Form3).

Fig. 2. A Meta-interpreter for Ordered Induced Magic



5.3 Dynamic Dependency Tracking 

One test of the efficacy of an iteration strategy is the number of iterations required
to reach the fixpoint. In order to assess how well ordered induced magic behaves, a
more sophisticated iteration strategy based on dynamic dependency tracking was
implemented. The strategy chosen was that of the WRT solver of GENA (Fecht, 1997;
Fecht & Seidl, 1999) since this recent work is particularly well described, has
extensive experimental results and conveniently fits with the redo list model.

file         strips  chat_parser  sim_v5-2  peval  aircraft  essln  chat_80  aqua_c
meet           39.3         40.5      41.5   44.6      35.4   48.3     41.0    43.5
join            8.7          8.7      10.0    6.4      10.5    8.0      9.1     8.7
join (diff)     1.0          2.0       0.1    2.6       0.2    0.7      1.8     1.3
equiv           8.7          8.7      10.0    6.4      10.5    8.0      9.1     8.7
proj            5.8          4.7       4.5    7.1       4.1    4.0      4.4     4.2
rename         36.5         35.4      34.1   33.0      39.3   31.0     34.5    33.6
total          6646        11324      5748   3992     12550  11754    32906  109612

Table 4. Frequency Analysis: Non-canonical Def Analyser with Ordered Induced
Magic

Table 5. Frequency Analysis of Compaction in Projection (Ordered Induced Magic)



The WRT strategy utilises a worklist, which is effectively reordered on-the-fly. 
To quote Fecht and Seidl (Fecht & Seidl, 1996), "The worklist now is organized as 
a (max) priority queue where the priority of an element [call pattern] is given by 
its time stamp," where the time stamp records the last time the solver was called 
for that call pattern. If, whilst solving for a call pattern, new call patterns are 
encountered, then the bottom answer pattern is not simply returned. Instead the 
solver tries to recursively compute a better approximation to this answer pattern. 
This tactic is also applied in PLAI and GAIA, though realised differently. 

The WRT strategy of GENA gives a small number of updates, hence is an 
attractive iteration strategy. However, its implementation in a backtrack driven 
meta-interpreter requires extensive use of the dynamic database for the auxiliary 
data-structures. In Prolog this is potentially expensive (Hermenegildo et al., 1992).



5.4 Frequency Analysis for Def: Reprise 

In section 3 a frequency analysis of the abstract domain operations in Def-based
groundness analysis was given. It was then argued that in light of these results
certain choices about the abstract domain operations should be made. These results 
are dependent on the iteration strategy of the analyser. In this section several 
different iteration strategies have been proposed and it needs to be checked that 
these give similar proportions of calls to the abstract domain operations - that is, 
that the choices for the abstract domain operations remain justified. Table 4 gives 
the frequency analysis for ordered induced magic driving non-canonical Def and 
indicates that the choices of domain operation remain valid. Note that for the BDD 
analyser, each rename is accompanied by a projection - this is not the case for 
non-canonical Def, explaining the lesser frequency of projection. This makes the 
non-canonical Def representation appear even more suitable. Table 5 demonstrates 
that projection still almost always avoids compaction. Similar distributions are
found with the other iteration strategies and for brevity these tables are omitted.



6 Experimental Evaluation

This section gives experimental results for a number of analysers with the objective 
of comparing the analysis proposed in the previous sections with existing techniques 
and evaluating the impact of the various tactics utilised. These analysers are built 
by selecting appropriate combinations of: abstract domain, domain representation, 
iteration strategy and optimisations. The analysers are evaluated in terms of both 
execution time and the underlying behaviour (i.e. the number of updates). All
implementations are coded in SICStus Prolog 3.8.3 with the exception of the domain
operations for Pos, which were written in C by Schachte (Schachte, 1999). The
analysers were run on a 296MHz Sun UltraSPARC-II with 1GByte of RAM running
Solaris 2.6. Programs are abstracted following the elegant (two program) scheme of
(Bueno et al., 1996) to guarantee correctness. Programs are normalised to definite
clauses. Timings are the arithmetic mean over 10 runs. Timeouts were set at five
minutes.



6.1 Domains: Timings and Precision 

Tables 6 and 7 give timing and precision results for the domains EPos, Def rep- 
resented in DBCF, non-canonical Def (denoted GEP after the representation) and 
Pos. In these tables, file is the name of the program analysed; size is the number 
of abstract clauses in the normalised program; abs is the time taken to read, parse 
and normalise the input file, producing the abstract program; fixpoint details the 
analysis time for the various domains; precision gives the total number of ground 
arguments in the call and answer patterns found by each analysis (excluding those 
introduced by normalising the program); % prec. loss gives the loss of precision of 
EPos and Def as compared to Pos - to emphasise where precision is lost, entries 
are only made when there is a precision loss. All the analyses were driven by the 
ordered induced magic iteration strategy. 

First consider precision. As is well known, in practice, for goal-dependent ground- 
ness analysis, the precision of Def is very close to that of Pos. In the benchmark 
suite used here, Def loses ground arguments in only two programs: rotate.pl, which 
loses three arguments, and sim_v5-2.pl, where two arguments are lost. EPos loses 
precision in several programs, but still performs reasonably well. (Goal-independent
analysis precision comparisons for EPos and Def are given in (Heaton et al., 2000)
and (Genaim & Codish, 2001). These show that EPos loses significant precision,
whereas Def gives precision close to that of Pos.)

The non-canonical Def analyser appears to be fast and scalable - taking more
than a second to analyse only the largest benchmark program. This analyser does
not employ widening (however, incorporating a widening would guarantee
robustness of the analyser, even for pathological programs (Genaim et al., 2001)). Notice
that the analysis times for all the programs are close to the abstraction time - this
suggests that a large speed up in the analysis time needs to be coupled with a
commensurate speedup in the abstracter.

Table 7. Groundness Results: Larger Programs

The non-canonical Def analysis times are comparable to those for EPos for
smaller programs, with EPos outperforming non-canonical Def on some of the
larger benchmarks. This is unsurprising given the much better theoretical behaviour 
of EPos, indeed it is much in the favour of non-canonical Def that it is competitive 
with EPos. The DBCF analyser suffers from the problems discussed in section 4. 
The cost in meet of maintaining the canonical form often becomes significant. In 
cases (such as in music.pl) where the number of variables, the number of body atoms 
and the size of the representation are all large, the exponential nature of reducing to 
canonical form leads to a massive blowup in analysis time. Hence the DBCF anal- 
yser fails to produce a result for several examples and gives poor scalability. Also, 
the analysis appears to lack robustness - the sensitivity of the meet to the form 
of the program clauses leads to widely varying results. Pos performs well on most 
programs, but is still consistently several times slower than non-canonical Def. Pos 
performs particularly poorly on parity.pl (a program designed to be problematic 
for BDD-based Pos analysers) and aqua_c.pl. Again, since the Pos analyser uses 
BDDs (essentially a canonical form) there is a cost in maintaining the representa- 
tion. This can lead to a lack of robustness. It should be pointed out that the Pos 
analyser is not state of the art and that one using the GER representation (Bagnara
& Schachte, 1999) would probably give improved results. Of course, widening could
be used to give improved times for Pos, but at the cost of precision. 



6.2 Iteration Strategy: Timings and Updates 

Table 8 gives timing results for non-canonical Def analysis when driven by various
iteration strategies. The column headers are abbreviations as follows: ord stands for
ordered induced magic; eim stands for eager induced magic; bom stands for
Bourdoncle magic; scm stands for SCC magic; sec stands for SCC induced magic; dyd
stands for dynamic dependency. The timings are split into two sections. The
overhead time is the preprocessing overhead incurred in calculating the SCCs required
to drive the analyses. For bom and scm, SCCs are calculated on the call and answer
graph of the magic program. For sec, SCCs are calculated on the call graph of the
original program. The strategies ord, eim and dyd do not require any preprocessing,
hence have no overhead. The strategy times are the times for analysing each
program (that is, the time taken for the fixpoint calculation, not including the
preprocessing overhead). Table 9 gives a second measure of the cost of each iteration
strategy; this time in terms of the number of updates (writes to database/extension
table) required to reach the fixpoint.

Table 8. Timing Results for Iteration Strategies

One important measure of the success of an iteration strategy is the number of 
updates required in the analysis. This impacts directly on the number of calls to 
abstract operations and hence the amount of work (speed) of the analysis. Table 
9 indicates that ord, sec and dyd give the best behaviour over a large number of 
programs. However, all of the other strategies give the best result for some programs, 
indicating that each has its merits. Observe that, as predicted in section 5, ord and 
sec give very similar results. 

In measuring performance of a particular analysis, the overall time taken is also 
of importance. Table 8 indicates that the methods based on SCCs in the call graph 
of the magic program have problems. Firstly, they require SCCs to be calculated 
- the cost of this (in particular for Bourdoncle magic) is significant. Secondly, 
the fixpoint times for bom and scm are much greater than would be expected 
from the results in Table 9. This is partly because the bom and scm strategies 
cannot be integrated with induced magic, which impacts heavily on speed. The
bom strategy also has a third drawback: the proportion of re-evaluations not
resulting in an update rises dramatically for larger programs. Larger programs
often give rise to deeply nested SCCs. Suppose an SCC, say A, nests a sub-SCC,
say B. In detecting the stability of A, the stability of the head of B needs to
be established. This in turn requires a single pass over B. If n passes over A are
required to reach stability, then n passes over B are also needed (even if B is already
stable). Extrapolating, the number of times an SCC is passed over is determined
by the sum of the number of passes over each SCC containing it. If the SCC is
deeply nested and large, this involves a large number of re-evaluations producing no
updates. As the scm strategy does not involve nested SCCs, this problem does not
arise. It appears that Bourdoncle's recursive strategy is not well suited for driving
groundness analyses of logic programs. Table 8 also indicates that whilst SCCs
on the call graph give comparable analysis times to ordered induced magic, they
too come with an overhead of precomputation. Sophisticated dynamic dependency
graphs do not pay for themselves in a groundness analysis involving lightweight
domain operations, as reflected by the timings for dyd. However, they are more
amenable to optimisation than ordered induced magic (which is itself essentially an
optimisation of induced magic) and in an analysis where the cost of the abstract
operations is higher it is to be expected that this strategy would be more effective.
Also, by using a different programming paradigm, the dynamic changes to the
dependency graph could be made more efficiently (for example, (Fecht & Seidl,
1999) use SML).

Table 9. Number of Updates for Iteration Strategies

Table 10. Chain Length Distributions
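
The cost of nested SCCs under the bom strategy can be illustrated with a small sketch. It is not the analyser's code: the SCC names, the `parent` map and the pass counts are hypothetical, and the sketch assumes that an SCC counts among the SCCs containing it.

```python
# Hypothetical nesting: each SCC maps to the SCC that directly encloses it
# (None for an outermost SCC). Names and pass counts are illustrative only.
parent = {"A": None, "B": "A", "C": "B"}

# Assumed number of passes each SCC needs on its own account to stabilise.
own_passes = {"A": 3, "B": 2, "C": 2}


def total_passes(scc, parent, own_passes):
    """Sum the passes over `scc` and over every SCC that encloses it."""
    total = 0
    while scc is not None:
        total += own_passes[scc]
        scc = parent[scc]
    return total


for name in parent:
    print(name, total_passes(name, parent, own_passes))
```

Under these assumed figures the innermost SCC is passed over more often than any SCC enclosing it, although it needs no more passes of its own than B does, which is the effect described for deeply nested SCCs.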



6.3 Chain Length 

Table 10 gives further details of the number of updates required in program analysis 
with non-canonical Def. This table gives the distribution of the number of updates 
required to reach the fixpoint for the various program predicates. Results are given 
for ord and dyd as it is clear from Table 9 that these are the most competitive 
strategies. Each column gives the number of predicates requiring that number of 
updates. Entries beyond the maximum number of updates have been left blank to 
highlight the maximum chain length. 

Chain length gives a good indication of the robustness of the iteration strategies. 
Whilst it is always possible to construct programs exhibiting worst case behaviour 
(Codish, 1999b; Genaim et al., 2001), Table 10 shows that for both ord and dyd,
very few chains are longer than 4 and that at worst chains have length 9. It also 
again indicates that different strategies can give significantly different behaviour for 
the analysis. 
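
A distribution of this kind could be derived from per-predicate update counts roughly as follows. This is only a sketch: the predicate names and counts are invented for illustration and are not figures from Table 10, and the columns are assumed to start at chain length 1.

```python
from collections import Counter

# Hypothetical update counts: predicate -> number of updates before its
# abstraction stabilised. These are not figures from Table 10.
updates = {"app/3": 2, "rev/2": 2, "qsort/2": 3, "part/4": 4, "main/0": 1}


def chain_length_distribution(updates):
    """Return a list whose entry i-1 is the number of predicates needing i updates."""
    counts = Counter(updates.values())
    longest = max(counts, default=0)
    return [counts.get(n, 0) for n in range(1, longest + 1)]


distribution = chain_length_distribution(updates)
print(distribution)        # one column per chain length, starting at 1
print(len(distribution))   # the maximum chain length
```

The list stops at the longest chain observed, mirroring the way entries beyond the maximum number of updates are left blank in the table.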



6.4 Optimisations 

A number of optimisations have been discussed in this paper. Table 11 details the 
effect of these, singly and in combination. The five optimisations considered have 
each been abbreviated by a single letter: e denotes filtering by entailment checking; 
g denotes the use of a GEP factorisation; p denotes filtering projection; r denotes 
the use of redundancy removal; t denotes the maintenance of a true factorisation. 
The column headers describe which optimisations have been switched on; for example,
gpr denotes the situation where the analysis uses a GEP factorisation, where
projection is filtered and where redundancy removal is used, but the factorisation
is not true and the entailment checking filter for join is not applied. Note that the
switch for the entailment checking does not entirely turn off the entailment check
filter for join, as the Def analysers enforce termination using the same entailment
check which filters join. In Proposition 2, the filtering of join has three cases; the
entailment check switch turns the first (most lightweight) case on and off. The default
for the non-canonical Def analyser which has been used for other timings in
this paper is egpr, since this gives the best result for most programs.

Table 11. Timing Results for Combinations of Optimisations
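
The column-header convention can be made concrete with a short sketch. The `SWITCHES` table and the `decode` function are hypothetical names introduced here for illustration; only the five letters and their meanings come from the text.

```python
# The five switch letters and their meanings, as listed in the text.
SWITCHES = {
    "e": "filtering by entailment checking",
    "g": "GEP factorisation",
    "p": "filtering projection",
    "r": "redundancy removal",
    "t": "true factorisation",
}


def decode(header):
    """Map a column header such as 'gpr' to an on/off flag per optimisation."""
    unknown = set(header) - set(SWITCHES)
    if unknown:
        raise ValueError(f"unknown switch letters: {sorted(unknown)}")
    return {letter: letter in header for letter in SWITCHES}


print(decode("gpr"))
print(decode("egpr"))
```

Decoding `gpr` switches on g, p and r and leaves e and t off, matching the example given for that column.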

The first three columns of Table 11 all give very similar times, indicating that true 
factorisation and redundancy removal have little effect on analysis times, essentially 
paying for themselves. The next three columns give times for the situation with one
of e, g, p switched off (relative to the default case). It is clear that turning off any
one of these optimisations gives a slowdown of, perhaps, 10%. The next three columns
give results for switching off optimisations in pairs. Again there is a clear slowdown 
from the previous three results (although notice that the epr and gr results are very 
similar), a slowdown of 15-20% from the default case. Finally, the last column shows 
that switching off all the optimisations results in a slowdown of approximately 25% 
in most programs. 

One conclusion to be drawn from Table 11 is that the non-canonical Def analysis 
is extremely robust. Even with all the optimisations for both the size of the
representation and the efficiency of the abstract operations turned off, the analysis is
still fast. It
is expected that the effect of turning off these optimisations would be bigger when 
using a less effective iteration strategy or a less suitable (orthogonal) representation. 



7 Related Work 

Van Hentenryck et al. (Van Hentenryck et al., 1995) is an early work which laid a
foundation for BDD-based Pos analysis. Corsini et al. (Corsini et al., 1993) describe
how variants of Pos can be implemented using Toupie, a constraint language based
on the μ-calculus. If this analyser were extended with, say, magic sets, it might lead
to a very respectable goal-dependent analysis. More recently, Bagnara and Schachte
(Bagnara & Schachte, 1999) have developed the idea (Bagnara, 1996) that a factorised
implementation of ROBDDs which keeps definite information separately
from dependency information is more efficient than keeping the two together. This 
hybrid representation can significantly decrease the size of an ROBDD and thus is 
a useful implementation tactic. 

Heaton et al. (Heaton et al., 2000) propose EPos, a sub-domain of Def, that
can only propagate dependencies of the form $(x_1 \leftrightarrow x_2) \wedge x_3$ across procedure
boundaries. This information is precisely that contained in one of the fields of the
GEP factorised domain. The main finding of (Heaton et al., 2000) is that this
sub-domain performs reasonably well for goal-dependent analysis. 

Armstrong et al. (Armstrong et al., 1998) study a number of different representations
of Boolean functions for both Def and Pos. An empirical evaluation on 15
programs suggests that specialising Dual Blake Canonical Form (DBCF) for Def 
leads to the fastest analysis overall. Armstrong et al. (Armstrong et al., 1998) also 
perform interesting precision experiments. Def and Pos are compared, however, in 
a bottom-up framework that is based on condensing and is therefore biased towards 
Pos. The authors point out that a top-down analyser would improve the precision 
of Def relative to Pos. 

Garcia de la Banda et al. (Garcia de la Banda et al., 1996) describe a Prolog 
implementation of Def that is also based on an orthogonal DBCF representation 
(though this is not explicitly stated) and show that it is viable for some medium 
sized benchmarks. Fecht and Seidl (Fecht, 1997; Fecht & Seidl, 1999) describe another
groundness analyser for Pos that is not coded in C. They adopt SML as
a coding medium in order to build an analyser that is declarative and easy to 
maintain. Their analyser employs a widening. 

Codish and Demoen (Codish & Demoen, 1995) describe a non-ground model
based implementation technique for Pos that would encode $x_1 \leftrightarrow (x_2 \wedge x_3)$ as three
tuples $\langle true, true, true \rangle$, $\langle false, \_, false \rangle$, $\langle false, false, \_ \rangle$. King et al. show how, for
Def , meet, join and projection can be implemented with quadratic operations based 
on a Sharing quotient (King et al., 1999). Def functions are essentially represented 
as a set of models and widening is thus required to keep the size of the representation 
manageable. Ideally, however, it would be better to avoid widening by, say, using a 
more compact representation. 
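
The three-tuple encoding mentioned above can be checked with a small sketch. It is written in Python rather than Prolog and is not the cited implementation: `None` stands in for an unbound (non-ground) position, and the helper names `instances` and `models` are hypothetical.

```python
from itertools import product

# Three non-ground tuples for x1 <-> (x2 and x3); None stands for an
# unconstrained position (an assumption of this sketch).
TUPLES = [(True, True, True), (False, None, False), (False, False, None)]


def instances(pattern):
    """Enumerate the ground truth assignments matched by a non-ground tuple."""
    choices = [(True, False) if v is None else (v,) for v in pattern]
    return set(product(*choices))


def models(formula, arity):
    """Enumerate the truth assignments that satisfy `formula`."""
    return {m for m in product((True, False), repeat=arity) if formula(*m)}


covered = set().union(*(instances(t) for t in TUPLES))
expected = models(lambda x1, x2, x3: x1 == (x2 and x3), 3)
print(covered == expected)  # True
```

The three tuples cover the four models of the function, with the all-false model matched by two of them, which shows how unbound positions keep the set of tuples smaller than the set of models.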

Most recently, Genaim and Codish (Genaim & Codish, 2001) propose a dual 
representation for Def. For function $f$, the models of $\mathit{coneg}(f)$ are named and $f$ is
represented by a tuple recording for each variable of $f$ which of these models the
variable is in. For example, the models of $\mathit{coneg}(x \rightarrow y)$ are $\{\{x, y\}, \{x\}, \emptyset\}$. Naming
the three models $a$, $b$, $c$ respectively, $f$ is represented by $\langle ab, a \rangle$. This representation
cleverly allows the well known ACI1 unification theory to be used for the domain
operations. (Genaim & Codish, 2001) report promising experimental results, but 
still need a widening to analyse the aqua_c benchmark. 
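
The worked example for the dual representation can be reproduced with the following sketch. It assumes that coneg complements each model with respect to the variable set, which is consistent with the example in the text, and that models are named in order of decreasing size; the function names are hypothetical and this is not the cited implementation.

```python
from itertools import product
from string import ascii_lowercase


def models(formula, variables):
    """Return the models of `formula` as frozensets of the variables set to true."""
    result = []
    for values in product((False, True), repeat=len(variables)):
        if formula(*values):
            result.append(frozenset(v for v, b in zip(variables, values) if b))
    return result


def dual_representation(formula, variables):
    """Name the models of coneg(f) and record, per variable, the models containing it."""
    everything = frozenset(variables)
    # coneg(f): complement every model of f with respect to the variable set.
    coneg = [everything - m for m in models(formula, variables)]
    # Ordering is an assumption, chosen to reproduce the naming in the text.
    coneg.sort(key=lambda m: (-len(m), sorted(m)))
    names = dict(zip(ascii_lowercase, coneg))
    return tuple(
        "".join(n for n, m in names.items() if v in m) for v in variables
    )


print(dual_representation(lambda x, y: (not x) or y, ("x", "y")))  # ('ab', 'a')
```

For x → y the variable x occurs in the models named a and b, and y occurs only in a, giving the tuple from the example.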



8 Conclusion

By considering the way in which goal-dependent groundness analyses proceed, an 
intelligent choice can be made as to how to represent the abstract domain and how 
the cost of the domain operations should be balanced. Analysing the relative fre- 
quencies of the domain operations leads to a representation which is compact, and 
where the most commonly called domain operations are the most lightweight. Filters 
for the more expensive domain operations are described which allow these opera- 
tions to be calculated by inexpensive special cases. Ways in which a non-ground 
representation for Boolean functions may exploit the language features of Prolog to 
obtain an efficient implementation are described. The iteration strategy for driving 
an analysis is also extremely important. Several strategies are discussed and compared.
It is concluded that for groundness analysis the fastest implementation uses
a simple strategy avoiding precomputation and sophisticated data-structures. An 
implementor might find some or all of the issues discussed and ideas raised in this 
paper useful in designing a program analysis and in implementing it in Prolog. 

The end product of this work is a highly principled goal-dependent groundness 
analyser combining the techniques described. It is written in Prolog and is small 
and easily maintained. The analyser is robust, fast, precise and scalable and does
not require widening for the largest program in the benchmark suite. Experimental 
results show that the speed of the fixpoint calculation is very close to that of reading, 
parsing and normalising the input file. Results also suggest that the performance of 
the analyser compares well with other groundness analysers, including BDD-based 
analysers written in C. 